Abstract In this paper we propose a supervised object recognition method using new global features and inspired by the model of the human primary visual cortex V1 as the semidiscrete roto-translation group $SE(2,N) = \mathbb{Z}_N \rtimes \mathbb{R}^2$. The proposed technique is based on generalized Fourier descriptors on the latter group, which are invariant to natural geometric transformations (rotations, translations). These descriptors are then used to feed an SVM classifier. We have tested our method against the COIL-100 image database and the ORL face database, and compared it with other techniques based on traditional descriptors, global and local. The obtained results have shown that our approach looks extremely efficient and stable to noise, in presence of which it outperforms the other techniques analyzed in the paper.

Keywords Descriptor · Fourier transform · hexagonal grid · geometric transformations · support vector machine · object recognition

1 Introduction 

Object recognition is a fundamental problem in computer vision and keeps attracting more and more attention. Its concepts have been applied in multiple fields, such as manufacturing, surveillance systems, optical character recognition, face recognition, etc.

Almost every object recognition algorithm proposed in the literature is based on the computation of certain features of the image, which make it possible to characterize the object depicted and to discriminate it from others. In particular, since objects can appear at different locations and with different sizes, it is desirable for such features to be invariant under translation, rotation and scale. These invariant features can be global, i.e. computed taking into account the whole image, or local, i.e. computed considering only neighborhoods of key-points in the image.

In this paper we focus on Fourier descriptors, an 
important class of global invariant features used since 
the seventies [19, 40] based on algebraic properties of 
the Fourier transform. In particular, inspired by some 
neurophysiological facts on the structure of the human 
primary visual cortex, we extend this theory to define
features invariant to translation and rotation and we
apply them to invariant object recognition in an SVM
context. These results are then compared with those
obtained with another important class of global invariant
features, the moment invariants (see Appendix A),
used e.g. in [10, 34, 35], and with two different local 
invariants. For more information on object recognition 
via local features we refer to [2, 12, 25, 27, 28, 30]. 

Our choice of a global approach is motivated by the better results obtained by these methods in the presence of noise, luminance changes and other alterations, compared with algorithms based on local invariant features [7]. Indeed, under these conditions, the key-point detectors used in the local approach produce key-points that are not relevant for object recognition.

In the following we briefly introduce the theory of Fourier descriptors, before discussing the framework used in this paper and our contributions.


1.1 Fourier descriptors 

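The basic idea behind Fourier descriptors is that the action of an abelian locally compact group $G$ on functions in $L^2(G)$ is much easier to treat at the level of their Fourier transforms. In the specific case of images, $f, g \in L^2(\mathbb{R}^2)$, this is expressed by the well-known equivalence for the translation by $a \in \mathbb{R}^2$:

$$f(x) = g(x - a) \ \ \forall x \in \mathbb{R}^2 \iff \hat f(\lambda) = e^{-i\langle a, \lambda\rangle}\, \hat g(\lambda) \ \ \forall \lambda \in \mathbb{R}^2, \qquad (1)$$

where the Fourier transform is defined¹ by

$$\hat f(\lambda) = \int_{\mathbb{R}^2} f(x)\, e^{-i\langle x, \lambda\rangle}\, dx, \quad \forall \lambda \in \mathbb{R}^2.$$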

In this setting, Fourier descriptors are quantities associated with functions of $L^2(\mathbb{R}^2)$ that can be easily computed starting from their Fourier representation and that are invariant under the action of translations. Ideally, a Fourier descriptor should be complete, meaning that for any pair of images $f, g \in L^2(\mathbb{R}^2)$ the equality of the Fourier descriptors is equivalent to the equality of $f$ and $g$ up to translations. Indeed, the lack of completeness could lead to problems in applications, notably to false positives in the classification.

However, a result as strong as completeness is usually out of reach and unnecessary for practical applications. In this case, one looks for Fourier descriptors that are at least weakly complete, meaning that they are complete on a sufficiently big subset of $L^2(\mathbb{R}^2)$, usually either open and dense or at least residual, i.e. the intersection of countably many open and dense sets. This guarantees that the Fourier descriptor will correctly classify a sufficiently large class of images.

Various Fourier descriptors have been defined in the 
literature [19, 24, 26, 38, 40]. In this work we are mainly 
interested in the following two, whose invariance w.r.t. 
translations can be checked via (1). 

— Power-spectrum: the quantity $PS_f(\lambda) := |\hat f(\lambda)|^2$ for $\lambda \in \mathbb{R}^2$, which is the Fourier transform of the auto-correlation function

$$a_f(x) := \int_{\mathbb{R}^2} \overline{f(y)}\, f(y + x)\, dy.$$

It is easy to show that the power-spectrum is not weakly complete, and indeed it is used in texture synthesis to identify the translation invariant Gaussian distribution of textures [18].

¹ Here we use a non-unitary definition of the Fourier transform for future convenience in computations.

— Bispectrum: an extension of the power-spectrum, it is the quantity $BS_f(\lambda_1, \lambda_2) := \hat f(\lambda_1)\, \hat f(\lambda_2)\, \overline{\hat f(\lambda_1 + \lambda_2)}$, or equivalently the Fourier transform of the triple correlation, defined as

$$a_{3,f}(x_1, x_2) := \int_{\mathbb{R}^2} \overline{f(y)}\, f(y + x_1)\, f(y + x_2)\, dy.$$

These descriptors are complete on compactly supported functions of $L^2(\mathbb{R}^2)$ and are well established in statistical signal processing. See e.g. [14], where they are applied to sound texture recognition.

These two Fourier descriptors can be easily generalized
to functions in $L^2(G)$ of a locally compact abelian
group $G$ to obtain invariants under the action of $G$. This
can be applied, for example, to 2D shape recognition.
However, when working with images, these descriptors 
are unsatisfying. Indeed, they are invariant only under 
translations, and so cannot be used to classify images 
under the action of rotations. 
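
As a minimal sketch of the two descriptors above, the following code works on the finite cyclic group $\mathbb{Z}_n$ instead of $\mathbb{R}^2$ (an assumption made only to keep the example exact and short). It uses a non-unitary discrete Fourier transform, checks numerically that both descriptors are unchanged by a translation, and shows that the power-spectrum cannot distinguish a signal from its reflection while the bispectrum can. The function names and the sample signal are illustrative and do not come from the paper.

```python
import cmath

def dft(f):
    """Non-unitary DFT on Z_n: F[m] = sum_x f[x] * exp(-2*pi*i*m*x/n)."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * m * x / n) for x in range(n))
            for m in range(n)]

def power_spectrum(f):
    """PS_f(m) = |F[m]|^2."""
    return [abs(c) ** 2 for c in dft(f)]

def bispectrum(f):
    """BS_f(a, b) = F[a] * F[b] * conj(F[a + b]), frequencies taken mod n."""
    F = dft(f)
    n = len(f)
    return [[F[a] * F[b] * F[(a + b) % n].conjugate() for b in range(n)]
            for a in range(n)]

def translate(f, a):
    """Translated signal g[x] = f[x - a], indices taken mod n."""
    n = len(f)
    return [f[(x - a) % n] for x in range(n)]

f = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
g = translate(f, 3)                                    # a translate of f
reflected = [f[(-x) % len(f)] for x in range(len(f))]  # not a translate of f

ps_gap = max(abs(p - q) for p, q in zip(power_spectrum(f), power_spectrum(g)))
bs_gap = max(abs(bispectrum(f)[a][b] - bispectrum(g)[a][b])
             for a in range(len(f)) for b in range(len(f)))
print("translation: PS gap", ps_gap, "BS gap", bs_gap)
```

Comparing `f` with `reflected` in the same way gives equal power-spectra but different bispectra, which illustrates on a toy case why the power-spectrum alone is not a discriminating descriptor.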


1.2 Framework of the paper 

In this paper, following a line of research started in [38], 
we present a theoretical framework that allows us to 
build generalized Fourier descriptors which are invariant
w.r.t. (semidiscrete) roto-translations of images. We
exploit the following two facts: 

— It is possible to define a natural generalization of the power spectrum and the bispectrum on non-commutative groups, as has been done in [24, 38].

— Contributions of some of the authors to a fairly recent model of the human primary visual cortex V1 [4, 5] have shown that the latter can be modeled as the semidiscrete group of roto-translations $SE(2,N) = \mathbb{Z}_N \rtimes \mathbb{R}^2$. In this model, cortical stimuli are functions in $L^2(SE(2,N))$, w.r.t. the Haar measure of $SE(2,N)$, and images from the visual plane are lifted to cortical stimuli via a natural injective and left-invariant lift operation $\mathcal{L} : L^2(\mathbb{R}^2) \to L^2(SE(2,N))$. Such a lift is defined as the wavelet transform w.r.t. a mother wavelet $\Psi$, see Section 2.

From these facts, a natural pipeline for invariant 
object recognition is the following: 

1. Given an image $f \in L^2(\mathbb{R}^2)$, lift it to a cortical stimulus $\mathcal{L}f \in L^2(SE(2,N))$.

2. Compute the generalized Fourier descriptors of $\mathcal{L}f$ on the non-commutative group $SE(2,N)$.

3. If the lift of another image $g \in L^2(\mathbb{R}^2)$ has the same Fourier descriptors as $\mathcal{L}f$, deduce that $\mathcal{L}f \sim \mathcal{L}g$ up to the action of $SE(2,N)$.

4. Thanks to the left-invariance and injectivity of the lift $\mathcal{L}$, obtain that also $f \sim g$ up to the action of $SE(2,N)$.

This pipeline was already investigated in [38], where the authors considered a non-left-invariant lift, the cyclic lift. For this lift they then proved a weak completeness result of the generalized bispectrum for images, represented as functions with support inside a fixed compact set.

In this paper we consider the same question for left- 
invariant lifts, where the situation turns out to be more 
complicated. In particular, as explained in the following
section, to ensure the weak completeness we are led
to consider “stronger” invariants than the generalized
bispectrum. However, as observed in Remark 4, the actual
computation of these stronger invariants on lifted
images requires N times less computational time and
space w.r.t. the computation of the generalized bispectrum
of cyclically lifted images.

1.3 Contributions of the paper 

Let $K \subset \mathbb{R}^2$ be any compact set, representing the size of the images under consideration. According to the pipeline for object recognition introduced above, the weak completeness of the generalized Fourier descriptors on images can be proved in two steps:

1. Prove the completeness of the generalized Fourier descriptors on some residual set $\mathcal{G} \subset L^2(\mathbb{Z}_N \times K)$ of cortical stimuli;

2. Prove that for some residual set $\mathcal{R} \subset L^2(\mathbb{R}^2)$ of images with support in $K$ we have $\mathcal{L}(\mathcal{R}) \subset \mathcal{G}$.

The first point is addressed in Theorem 1, which identifies an open and dense set $\mathcal{G} \subset L^2(\mathbb{Z}_N \times K)$ on which the combination of the generalized power-spectrum and bispectrum is complete. This generalizes the result in [38], where the same result was proved for a residual subset of the range of the cyclic lift.

Unfortunately, it turns out that for this set $\mathcal{G}$ and a left-invariant lift $\mathcal{L}$ there is no hope of finding a set $\mathcal{R} \subset L^2(\mathbb{R}^2)$ satisfying the second point above. We are then led to introduce stronger Fourier descriptors, the rotational power-spectrum and bispectrum, which are invariant only w.r.t. rotations. To compensate for the lack of translation invariance, we preprocess images by centering them at their barycenter, a procedure that is essential also in [38]. Theorem 2 then shows that the resulting invariants are complete for an open and dense set of functions in $L^2(K)$, for any compact $K \subset \mathbb{R}^2$. The proof of this completeness requires fine technical tools from harmonic analysis and the theory of circulant operators, and for this reason we only present a sketch of it, highlighting the technical difficulties. A complete proof will be presented in a forthcoming paper by the second and last authors.

Finally, in Theorem 3 we show that, under mild assumptions on the mother wavelet $\Psi$, to check the equality of all these Fourier descriptors it is sufficient to compute simple quantities obtained from the 2D Fourier transform of the image. This allows for an efficient implementation on regular hexagonal grids. After using these descriptors to feed an SVM-based classifier, we compare their performances with those of Hu and Zernike moments, the Fourier-Mellin transform and some well-known local descriptors. To this purpose, we test them on two large databases: the COIL-100² object recognition database, composed of 7200 images of 100 objects presenting rotation and scale changes [31], and the ORL³ face database, in which different human faces are subject to several kinds of variations.


1.4 Structure of the paper 

The remainder of the paper is organized as follows. In Section 2, we present the features of a mathematical model of the primary visual cortex V1 that are essential to our approach. In Section 3, we introduce some generalities on the Fourier transform on the semidiscrete group of roto-translations $SE(2,N)$. In Section 4, we describe the natural generalization of the power-spectrum and the bispectrum to $SE(2,N)$. We then prove the weak completeness result (Theorem 1) and show that under the chosen lift operator this does not imply weak completeness for images. Finally, we introduce the rotational power-spectrum and bispectrum and sketch the proof of the corresponding weak completeness result (Theorem 2) for images. We end this section with some results on the practical computation of these descriptors. In Section 5 we illustrate some numerical results where these descriptors are compared with global descriptors such as Zernike moments, Hu moments and the Fourier-Mellin transform, and with local ones such as the SIFT and HoG descriptors. Finally, we conclude with some practical suggestions in Section 6.


² http://www.cs.Columbia.edu/CAVE/software/softlib/coil-100.php

³ http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

2 A mathematical model of the primary visual cortex V1

As mentioned in the introduction, the main novelty of our approach is its connection with a fairly recent model of the human primary visual cortex V1 due to Petitot and Citti-Sarti [11, 32] and our recent contributions [3, 4, 5, 33]. The theory of orientation scores introduced in [15, 16] is also strongly connected with this work, in particular for its exploitation of left-invariant lift operators. We also mention [37], where image invariants based on the structure of the roto-translation group $SE(2)$ are introduced for textures. In this section we present the features of this model that are essential to our approach.

Since it is well-known [23] that neurons in V1 are sensitive not only to positions in the visual field but also to local orientations, and that it is reasonable to assume these orientations to be finite in number, in [4] V1 has been modeled as the semidiscrete group of roto-translations $SE(2,N) = \mathbb{Z}_N \rtimes \mathbb{R}^2$ for some even $N \in \mathbb{N}$. Letting $R_k$ be the rotation of angle $2\pi k/N$, the (non-commutative) group operation of $SE(2,N)$ is

$$(x, k)(y, r) = (x + R_k y,\ k + r).$$

Here, we are implicitly identifying $k + r$ with $k + r$
mod N.
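
The group law can be made concrete with a small sketch. The code below assumes $N = 4$, so that the rotations $R_k$ are multiples of 90 degrees and act exactly on integer points; this choice, the function names and the sample elements are illustrative and are not the setting used in the experiments. It implements the product, the inverse $(x,k)^{-1} = (-R_{-k}x, -k)$, and lets one check associativity and non-commutativity.

```python
N = 4  # number of orientations; assumed here so that rotations are exact on integers

def rot(k, p):
    """Rotate the point p = (p1, p2) by the angle 2*pi*k/N (multiples of 90 degrees)."""
    x, y = p
    for _ in range(k % N):
        x, y = -y, x
    return (x, y)

def mul(a, b):
    """Group law (x, k)(y, r) = (x + R_k y, k + r mod N)."""
    (x, k), (y, r) = a, b
    ry = rot(k, y)
    return ((x[0] + ry[0], x[1] + ry[1]), (k + r) % N)

def inv(a):
    """Inverse element (x, k)^{-1} = (-R_{-k} x, -k mod N)."""
    x, k = a
    rx = rot(-k, x)
    return ((-rx[0], -rx[1]), (-k) % N)

IDENTITY = ((0, 0), 0)
a, b, c = ((1, 2), 1), ((-3, 0), 2), ((5, -1), 3)

print(mul(mul(a, b), c) == mul(a, mul(b, c)))  # associativity on this sample
print(mul(a, b) == mul(b, a))                  # the group is not commutative
```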

Visual stimuli $f \in L^2(\mathbb{R}^2)$ are assumed to be lifted to activation patterns in $L^2(SE(2,N))$ by a lift operator $\mathcal{L} : L^2(\mathbb{R}^2) \to L^2(SE(2,N))$. Motivated by neurophysiological evidence, we then assume that

(H) The lift operator $\mathcal{L}$ is linear and is defined as

$$\mathcal{L}f(x, k) := \int_{\mathbb{R}^2} f(y)\, \overline{\Psi}(R_{-k}(y - x))\, dy, \qquad (2)$$

for a given mother wavelet $\Psi \in L^2(\mathbb{R}^2)$ such that $\mathcal{L}$ is injective and bounded.

Remark 1 This assumption means that the lift operator under consideration is the wavelet transform w.r.t. $\Psi$ (see, e.g., [17]). The fact that $\mathcal{L}$ is injective and bounded is then equivalent to the mother wavelet $\Psi$ being weakly admissible, i.e., such that the map

$$\lambda \in \mathbb{R}^2 \mapsto \sum_{k \in \mathbb{Z}_N} |\hat\Psi(R_{-k}\lambda)|^2$$

is strictly positive and essentially bounded.

As a consequence of the above assumption, the lift operation $\mathcal{L}$ is left-invariant w.r.t. the action of $SE(2,N)$. Namely,

$$\Lambda(x, k) \circ \mathcal{L} = \mathcal{L} \circ \pi(x, k). \qquad (3)$$

Here $\Lambda$ and $\pi$ are the actions of $SE(2,N)$ on $L^2(SE(2,N))$ and $L^2(\mathbb{R}^2)$, respectively. That is,

$$[\Lambda(x, k)\varphi](y, r) = \varphi\big((x, k)^{-1}(y, r)\big) = \varphi(R_{-k}(y - x),\ r - k),$$

$$[\pi(x, k)f](y) = f\big((x, k)^{-1}y\big) = f(R_{-k}(y - x)).$$

Formula (3) can be seen as a semidiscrete version of the shift-twist symmetry [6].

The main observation for our purposes is that (3) means that two images $f$ and $g \in L^2(\mathbb{R}^2)$ can be deduced from one another via roto-translation (i.e., $f = \pi(x, k)g$ for some $(x, k) \in SE(2,N)$) if and only if their lifts can be deduced from one another via $\Lambda(x, k)$.

3 Preliminaries on non-commutative harmonic analysis

In this section we introduce some generalities on the (non-commutative) Fourier transform on $SE(2,N)$, an essential tool to define and compute the Fourier descriptors we are interested in. We refer to [1, 20] for a general introduction to the topic.

Since $SE(2,N)$ is a non-commutative unimodular group, the Fourier transform of $\varphi \in L^2(SE(2,N))$ is an operator associating to each (continuous) irreducible unitary representation $T^\lambda$ of $SE(2,N)$ some Hilbert-Schmidt operator on the Hilbert space where $T^\lambda$ acts. Here, $\lambda$ is an index taking values in the dual object of $SE(2,N)$, which is denoted by $\widehat{SE(2,N)}$ and is the set of equivalence classes of irreducible unitary representations.

The set of irreducible representations of a semi-direct product group can be obtained via Mackey’s machinery (see, e.g., [1, Ch. 17.1, Theorems 4 and 5]). Accordingly, $\widehat{SE(2,N)}$ is parametrized by the orbits of the (contragredient) action of rotations $\{R_k\}_{k \in \mathbb{Z}_N}$ on $\mathbb{R}^2$, i.e., by the slice $\mathcal{S} \subset \mathbb{R}^2$ which in polar coordinates is $(0, +\infty) \times [0, 2\pi/N)$. Additionally, corresponding to the origin, there are the characters of $\mathbb{Z}_N$. Namely, to each $\lambda \in \mathcal{S}$ corresponds the representation $T^\lambda$ acting on $\mathbb{C}^N$ via

$$T^\lambda(x, k)v = \operatorname{diag}_h\big(e^{i\langle \lambda, R_h x\rangle}\big) \circ S^k v = \big(e^{i\langle \lambda, R_h x\rangle}\, v_{h+k}\big)_{h=0}^{N-1}, \qquad (4)$$

where we denoted by $\operatorname{diag}_h v_h$ the diagonal matrix of diagonal $v \in \mathbb{C}^N$ and by $S$ the shift operator $(Sv)_j = v_{j+1}$, so that $(S^k v)_j = v_{j+k}$. On the other hand, to each $k \in \mathbb{Z}_N$ corresponds the representation on $\mathbb{C}$ given by $z \mapsto e^{i\frac{2\pi k}{N}} z$. Since it is possible to show that to invert the Fourier transform it is enough to consider only the representations parametrized by $\mathcal{S}$, we will henceforth ignore the $\mathbb{Z}_N$ part of the dual.
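
The following sketch checks numerically that the matrices of (4) multiply according to the group law, i.e. $T^\lambda(a)\,T^\lambda(b) = T^\lambda(ab)$. It assumes the form $T^\lambda(x,k) = \operatorname{diag}_h(e^{i\langle\lambda, R_h x\rangle}) \circ S^k$ with $(Sv)_j = v_{j+1}$; the value $N = 6$, the sample points and all function names are illustrative choices, not taken from the paper.

```python
import cmath
import math

N = 6  # number of orientations (illustrative choice)

def rot(k, p):
    """Rotate p by the angle 2*pi*k/N."""
    t = 2 * math.pi * k / N
    return (math.cos(t) * p[0] - math.sin(t) * p[1],
            math.sin(t) * p[0] + math.cos(t) * p[1])

def mul(a, b):
    """Group law (x, k)(y, r) = (x + R_k y, k + r mod N)."""
    (x, k), (y, r) = a, b
    ry = rot(k, y)
    return ((x[0] + ry[0], x[1] + ry[1]), (k + r) % N)

def T(lam, a):
    """Matrix of T^lam(x, k): row h has the single entry exp(i<lam, R_h x>) in column h + k."""
    x, k = a
    M = [[0j] * N for _ in range(N)]
    for h in range(N):
        rx = rot(h, x)
        M[h][(h + k) % N] = cmath.exp(1j * (lam[0] * rx[0] + lam[1] * rx[1]))
    return M

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(N)) for j in range(N)]
            for i in range(N)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(N) for j in range(N))

lam = (0.7, 0.2)
a, b = ((1.0, -2.0), 2), ((0.5, 3.0), 5)
err = max_diff(matmul(T(lam, a), T(lam, b)), T(lam, mul(a, b)))
print("homomorphism defect:", err)
```

The defect is at the level of floating-point rounding, which is consistent with $T^\lambda$ being a representation under the assumed conventions.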

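Finally, the matrix-valued Fourier coefficient of a function $\varphi \in L^2(SE(2,N)) \cap L^1(SE(2,N))$ for $\lambda \in \widehat{SE(2,N)}$ is

$$\hat\varphi(T^\lambda) = \int_{SE(2,N)} \varphi(a)\, T^\lambda(a^{-1})\, da, \qquad (5)$$

where $da$ is the Haar measure⁴ of $SE(2,N)$. This is essentially the same formula as for the Fourier transform on $\mathbb{R}^2$, which is a scalar and is obtained using the representations $T^\lambda(x) = e^{i\langle \lambda, x\rangle}$ acting on $\mathbb{C}$.

Straightforward computations yield

$$\hat\varphi(T^\lambda)_{i,j} = \mathcal{F}\big(\varphi(\cdot, i - j)\big)(R_{-j}\lambda), \qquad (6)$$

where we let $\mathcal{F}$ denote the Fourier transform on $\mathbb{R}^2$.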

As usual, the definition of the Fourier transform can be extended to the whole $L^2(SE(2,N))$ by density arguments. Then, there exists a unique measure on $\widehat{SE(2,N)}$, the Plancherel measure, supported on $\mathcal{S}$ where it coincides with the restriction of the Lebesgue measure of $\mathbb{R}^2$, such that the Fourier transform is an isometry between $L^2(SE(2,N))$ and $L^2(\widehat{SE(2,N)})$. In particular, the following inversion formula holds

$$\varphi(x, k) = \int_{\mathcal{S}} \operatorname{Tr}\big(\hat\varphi(T^\lambda) \circ T^\lambda(x, k)\big)\, d\lambda.$$

The fundamental property of the non-commutative Fourier transform, generalizing (1), is that for all $\varphi, \eta \in L^2(SE(2,N))$ and $a \in SE(2,N)$ it holds

$$\varphi(x, k) = [\Lambda(a)\eta](x, k) \ \ \forall (x, k) \in SE(2,N) \iff \hat\varphi(T) = \hat\eta(T) \circ T^{-1}(a) \ \ \forall T \in \widehat{SE(2,N)}. \qquad (7)$$

Namely, $\varphi, \eta$ can be deduced from one another via the action of $SE(2,N)$ if and only if their Fourier transforms at a representation $T$ can be deduced from one another via multiplication by $T(a)$.

Remark 2 The fact that the Fourier transform in (5) is matrix-valued is a direct consequence of $SE(2,N)$ being a Moore group, that is, that all the $T^\lambda$ act on finite-dimensional spaces. This is not true for the roto-translation group $SE(2)$. As a consequence, the Fourier transform on $SE(2)$ takes values not in the finite-dimensional space of complex $N \times N$ matrices, but in the infinite-dimensional space of operators over $L^2(\mathbb{S}^1)$. This is indeed the main theoretical advantage of considering $SE(2,N)$.

⁴ That is, up to a multiplicative constant, the only left and right invariant measure on $SE(2,N)$. One can check that $\int_{SE(2,N)} \varphi(a)\, da = \sum_{k=0}^{N-1} \int_{\mathbb{R}^2} \varphi(x, k)\, dx$.


3.1 Decomposition of tensor product representations 

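Proofs of Section 4 will use a well-known fact on tensor product representations: the Induction-Reduction Theorem (see [1]). This theorem makes it possible to decompose the tensor product of representations $T^{\lambda_1} \otimes T^{\lambda_2}$ acting on $\mathbb{C}^N \otimes \mathbb{C}^N$ into an equivalent representation acting on $\bigoplus_{k \in \mathbb{Z}_N} \mathbb{C}^N$, which is a block-diagonal operator whose block elements are of the form $T^{\lambda_1 + R_k\lambda_2}$. Moreover, the linear transformation realizing the equivalence is explicit.

To avoid confusion, we will henceforth denote components of vectors $v \in \mathbb{C}^N$ as $v(0), \ldots, v(N-1)$, elements of $\bigoplus_{k \in \mathbb{Z}_N} \mathbb{C}^N$ as $(w_k)_{k \in \mathbb{Z}_N}$ where $w_k \in \mathbb{C}^N$, and the components of vectors $v \in \mathbb{C}^N \otimes \mathbb{C}^N$ as $v(k, h)$ for $k, h = 0, \ldots, N-1$. We also remark that linear operators $B$ on $\bigoplus_{k \in \mathbb{Z}_N} \mathbb{C}^N$ can be decomposed as $B = (B^{k,h})_{k,h \in \mathbb{Z}_N}$, where each $B^{k,h}$ is an $N \times N$ complex matrix. Namely, we have

$$B(w_k)_{k \in \mathbb{Z}_N} = \Big(\sum_{h=0}^{N-1} B^{k,h} w_h\Big)_{k \in \mathbb{Z}_N}. \qquad (8)$$

Then, exploiting the commutation of the Fourier transform with equivalences of representations, the Induction-Reduction Theorem implies that for every $\varphi \in L^2(SE(2,N))$ and any $\lambda_1, \lambda_2 \in \mathcal{S}$ it holds

$$A \circ \hat\varphi(T^{\lambda_1} \otimes T^{\lambda_2}) \circ A^{-1} = \bigoplus_{k \in \mathbb{Z}_N} \hat\varphi(T^{\lambda_1 + R_k\lambda_2}). \qquad (9)$$

Here, $A : \mathbb{C}^N \otimes \mathbb{C}^N \to \bigoplus_{k \in \mathbb{Z}_N} \mathbb{C}^N$ is the equivalence given by

$$(Av)_k(h) = (A_k v)(h) = v(h, h - k), \quad \forall v \in \mathbb{C}^N \otimes \mathbb{C}^N. \qquad (10)$$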
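
A hedged numerical illustration of this decomposition: assuming the representations have the form $T^\lambda(x,k) = \operatorname{diag}_h(e^{i\langle\lambda, R_h x\rangle}) \circ S^k$ and that the equivalence is $(Av)_k(h) = v(h, h-k)$, the sketch below applies $(T^{\lambda_1} \otimes T^{\lambda_2})(a)$ to a vector, maps the result through $A$, and compares each block $k$ with $T^{\lambda_1 + R_k\lambda_2}(a)$ applied to the corresponding block of $Av$. The value $N = 6$, the sample data and the function names are assumptions made for the example.

```python
import cmath
import math

N = 6  # number of orientations (illustrative choice)

def rot(k, p):
    """Rotate p by the angle 2*pi*k/N."""
    t = 2 * math.pi * k / N
    return (math.cos(t) * p[0] - math.sin(t) * p[1],
            math.sin(t) * p[0] + math.cos(t) * p[1])

def T(lam, a):
    """Matrix of T^lam(x, k) = diag_h(exp(i<lam, R_h x>)) S^k, with (S v)_j = v_{j+1}."""
    x, k = a
    M = [[0j] * N for _ in range(N)]
    for h in range(N):
        rx = rot(h, x)
        M[h][(h + k) % N] = cmath.exp(1j * (lam[0] * rx[0] + lam[1] * rx[1]))
    return M

def tensor_apply(lam1, lam2, a, v):
    """Apply (T^lam1 (x) T^lam2)(a) to v, stored as v[h][h2]."""
    A1, A2 = T(lam1, a), T(lam2, a)
    return [[sum(A1[h][p] * A2[h2][q] * v[p][q] for p in range(N) for q in range(N))
             for h2 in range(N)] for h in range(N)]

def A_map(v):
    """Equivalence A: block k of A v has components (A v)_k(h) = v(h, h - k)."""
    return [[v[h][(h - k) % N] for h in range(N)] for k in range(N)]

def apply(M, w):
    return [sum(M[i][j] * w[j] for j in range(N)) for i in range(N)]

lam1, lam2 = (0.7, 0.2), (-0.4, 1.1)
a = ((1.5, -0.5), 3)
v = [[complex(h + 1, h2 - 2) for h2 in range(N)] for h in range(N)]

lhs = A_map(tensor_apply(lam1, lam2, a, v))
Av = A_map(v)
rhs = []
for k in range(N):
    r2 = rot(k, lam2)
    rhs.append(apply(T((lam1[0] + r2[0], lam1[1] + r2[1]), a), Av[k]))

err = max(abs(lhs[k][h] - rhs[k][h]) for k in range(N) for h in range(N))
print("block-diagonalization defect:", err)
```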

4 Fourier descriptors on $SE(2,N)$

In the following sections we introduce and study the Fourier descriptors on the group $SE(2,N)$. As already mentioned, proving a general completeness result is essentially hopeless, and we will content ourselves with proving weak completeness.

Let $K \subset \mathbb{R}^2$ be a compact set. In the following we will be mainly concerned with functions that are compactly supported either in $K$ or in $K \times \mathbb{Z}_N \subset SE(2,N)$.

4.1 Generalized Fourier descriptors 

Following [38], the power spectrum and the bispectrum on $\mathbb{R}^2$ can be generalized to $SE(2,N)$ as follows.

Definition 1 The generalized power-spectrum and bispectrum of $\varphi \in L^2(SE(2,N))$ are the collections of matrices, for any $\lambda, \lambda_1, \lambda_2 \in \mathcal{S}$,

$$PS_\varphi(\lambda) := \hat\varphi(T^\lambda) \circ \hat\varphi(T^\lambda)^*,$$

$$BS_\varphi(\lambda_1, \lambda_2) := \hat\varphi(T^{\lambda_1}) \otimes \hat\varphi(T^{\lambda_2}) \circ \hat\varphi(T^{\lambda_1} \otimes T^{\lambda_2})^*.$$

The next result generalizes, with a simplified proof, the result presented in [38]. Let us mention that this result is indeed true in a more general setting, as will be shown in a forthcoming paper by Prandi and Gauthier.

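Theorem 1 Let $K \subset \mathbb{R}^2$ be a compact set. The generalized power-spectrum and bispectrum are weakly complete on $L^2(\mathbb{Z}_N \times K)$. In particular, they discriminate on the open and dense set $\mathcal{G} \subset L^2(\mathbb{Z}_N \times K)$ of functions $\varphi$ supported in $\mathbb{Z}_N \times K$ and whose Fourier transform $\hat\varphi(T^\lambda)$ is invertible for an open and dense set of $\lambda$'s. That is, $\varphi_1, \varphi_2 \in \mathcal{G}$ are such that $PS_{\varphi_1} = PS_{\varphi_2}$ and $BS_{\varphi_1} = BS_{\varphi_2}$ if and only if $\varphi_1 = \Lambda(x, k)\varphi_2$ for some $(x, k) \in SE(2,N)$.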

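Proof The fact that $\mathcal{G}$ is open and dense is proved in Lemma 1 in Appendix B. Let $\varphi, \eta \in \mathcal{G}$ be such that $PS_\varphi = PS_\eta$ and $BS_\varphi = BS_\eta$. The equality of the generalized bispectrum implies that the set of $\lambda$'s where $\hat\varphi(T^\lambda)$ and $\hat\eta(T^\lambda)$ fail to be invertible is the same. We will denote its complement by $I$ and let

$$U(T^\lambda) := \hat\varphi(T^\lambda)^{-1}\, \hat\eta(T^\lambda) \quad \forall \lambda \in I.$$

In order to complete the proof of the statement, we will prove that $U(T^\lambda)$ can be defined for all $\lambda$'s in $\mathcal{S}$ and, moreover, that $U(T^\lambda) = T^\lambda(a)$ for some $a \in SE(2,N)$. Indeed, by (7) this will readily imply that $\varphi = \Lambda(a)\eta$ as announced.

We claim that $U(T^\lambda)$ is unitary for all $\lambda \in I$. Indeed, by the equality of the generalized power-spectrum we have

$$U(T^\lambda)^*\, U(T^\lambda) = \hat\eta(T^\lambda)^*\, PS_\varphi(\lambda)^{-1}\, \hat\eta(T^\lambda) = \mathrm{Id}.$$

Observe that the equality of the generalized bispectrum and the definition of $U$ imply that for all $\lambda_1, \lambda_2 \in I$ it holds

$$BS_\varphi(\lambda_1, \lambda_2) = \hat\varphi(T^{\lambda_1}) \otimes \hat\varphi(T^{\lambda_2}) \circ U(T^{\lambda_1}) \otimes U(T^{\lambda_2}) \circ \hat\eta(T^{\lambda_1} \otimes T^{\lambda_2})^*.$$

By the invertibility of $\hat\varphi(T^{\lambda_1}) \otimes \hat\varphi(T^{\lambda_2})$ and the unitarity of $U$, this yields

$$\hat\varphi(T^{\lambda_1} \otimes T^{\lambda_2}) \circ U(T^{\lambda_1}) \otimes U(T^{\lambda_2}) = \hat\eta(T^{\lambda_1} \otimes T^{\lambda_2}). \qquad (11)$$

The announced result is then a consequence of the following three facts, which are proved in Appendix B.

1. Lemma 2: The function $\lambda \mapsto U(T^\lambda)$ is continuous on $I$.

2. Lemma 3: The function $\lambda \mapsto U(T^\lambda)$ can be extended to a continuous function on $\mathcal{S}$ for which (11) is still true.

3. Lemma 4: There exists $a \in SE(2,N)$ such that $U(T^\lambda) = T^\lambda(a)$. □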

An immediate corollary is the following.

Corollary 1 Let $\mathcal{L} : L^2(\mathbb{R}^2) \to L^2(SE(2,N))$ be an injective lift operator (not necessarily satisfying (2)). Assume that there exists a residual set $\mathcal{R} \subset L^2(\mathbb{R}^2)$ such that $\mathcal{L}(\mathcal{R}) \cap \mathcal{G}$ is residual. Then, the generalized power-spectrum and bispectrum are weakly complete on $L^2(\mathbb{R}^2)$. Namely, for any $f, g \in \mathcal{R}$ it holds that $BS_{\mathcal{L}f} = BS_{\mathcal{L}g}$ if and only if $f = \pi(x, k)g$ for some $(x, k) \in SE(2,N)$.

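Remark 3 In [38], the authors applied their version of Theorem 1 to a non-left-invariant lift, called the cyclic lift. Indeed, for this cyclic lift, when $N$ is odd, it is possible to prove that, for any compact $K \subset \mathbb{R}^2$, there exists a residual set $\mathcal{R} \subset L^2(K)$ satisfying the assumptions of Corollary 1.

Unfortunately, Corollary 1 can never be applied to lifts of the form (2). In fact, as proved in Appendix C, letting $\omega_f(\lambda) := (\hat f(R_{-k}\lambda))_{k=0}^{N-1} \in \mathbb{C}^N$, we have that

$$\widehat{\mathcal{L}f}(T^\lambda) = \omega_\Psi(\lambda)^* \otimes \omega_f(\lambda)^*, \qquad (12)$$

where for $v, w \in \mathbb{C}^N$, we let $v^* = (\overline{v_k})_k$ and $(v \otimes w)_{k,h} = v_k \overline{w_h}$, so that $(v \otimes w)u = \langle w, u\rangle v$ for all $u \in \mathbb{C}^N$. This immediately implies that $\operatorname{rank} \widehat{\mathcal{L}f}(T^\lambda) \le 1$ and hence that $\operatorname{range} \mathcal{L} \cap \mathcal{G} = \emptyset$ whenever $N > 1$.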

4.2 Rotational Fourier descriptors 

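To bypass the difficulty posed by the non-invertibility of the Fourier transform for lifted functions, we are led to consider the following stronger descriptors.

Definition 2 The rotational power-spectrum and bispectrum of $\varphi \in L^2(SE(2,N))$ are the collections of matrices, for any $\lambda, \lambda_1, \lambda_2 \in \mathcal{S}$ and $h \in \mathbb{Z}_N$,

$$RPS_\varphi(\lambda, h) := \hat\varphi(T^{R_h\lambda}) \circ \hat\varphi(T^\lambda)^*,$$

$$RBS_\varphi(\lambda_1, \lambda_2, h) := \hat\varphi(T^{R_h\lambda_1}) \otimes \hat\varphi(T^{\lambda_2}) \circ \hat\varphi(T^{\lambda_1} \otimes T^{\lambda_2})^*.$$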

As already mentioned in the introduction, the rotational descriptors are invariant only under the action of $\mathbb{Z}_N \subset SE(2,N)$ but not under translations. To avoid this problem, let us fix a compact $K \subset \mathbb{R}^2$ and consider the set $\mathcal{A} \subset L^2(\mathbb{R}^2)$ of functions compactly supported in $K$, with non-zero average⁵. Observe that this is an open and dense subset of $L^2(K)$. We can then define the barycenter $c_f \in \mathbb{R}^2$ of $f \in \mathcal{A}$ as

$$c_f = \frac{1}{\operatorname{avg} f}\Big(\int_{\mathbb{R}^2} x_1 f(x)\, dx,\ \int_{\mathbb{R}^2} x_2 f(x)\, dx\Big),$$

and the centering operator $\Phi : \mathcal{A} \to \mathcal{A}$, which moves the barycenter to the origin, as

$$\Phi f(x) := f(x + c_f). \qquad (13)$$

Then, considering the centered lift $\mathcal{L}_c = \mathcal{L} \circ \Phi$, we have that $\mathcal{L}_c f = \mathcal{L}_c g$ if and only if $g$ is a translate of $f$. In particular,

$$\mathcal{L}_c f = \Lambda(0, k)\mathcal{L}_c g \iff f = \pi(x, k)g \ \text{ for some } x \in \mathbb{R}^2.$$
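
A minimal sketch of the centering step on a discrete image, under the assumption that the image is given as a finite set of weighted sample points (a dictionary from positions to values, a representation chosen only for this example). Integrals become sums, and centering moves every sample so that the barycenter of the result is the origin; two translates of the same image then have the same centered version.

```python
def average(img):
    """Discrete analogue of avg f: the sum of the sample values."""
    return sum(img.values())

def barycenter(img):
    """Barycenter c_f = (1 / avg f) * (sum x1 f(x), sum x2 f(x)); requires avg f != 0."""
    m = average(img)
    return (sum(x1 * v for (x1, _), v in img.items()) / m,
            sum(x2 * v for (_, x2), v in img.items()) / m)

def center(img):
    """Shift every sample position by -c_f, so that the new barycenter is the origin."""
    c1, c2 = barycenter(img)
    return {(x1 - c1, x2 - c2): v for (x1, x2), v in img.items()}

def translate(img, a):
    """Translate the image by the vector a."""
    return {(x1 + a[0], x2 + a[1]): v for (x1, x2), v in img.items()}

f = {(0, 0): 1.0, (2, 0): 3.0, (0, 4): 2.0, (1, 1): 2.0}
g = translate(f, (7, -3))

print(barycenter(f), barycenter(g))
print(sorted(center(f).items()) == sorted(center(g).items()))
```

On real pixel grids the barycenter is generally not a grid point, so an implementation would need interpolation or a phase correction in the Fourier domain; that choice is outside the scope of this sketch.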

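Let us consider the following set of functions.

Definition 3 Let $\mathcal{R} \subset L^2(\mathbb{R}^2)$ be the set of real-valued functions $f$ supported in $K$, such that $\hat f(\lambda) \neq 0$ for a.e. $\lambda \in \mathbb{R}^2$ and the family $\Omega_f = \{S^k\omega_f(\lambda)\}_{k \in \mathbb{Z}_N}$ is a basis for $\mathbb{C}^N$, if $N$ is odd, or, if $N$ is even, for

$$X = \{v \in \mathbb{C}^N \mid v(h) = \overline{v(h + N/2)} \ \ \forall h \in \mathbb{Z}_N\}.$$

The dependence of this definition on the parity of $N$ comes from the well-known fact that $\hat f(-\lambda) = \overline{\hat f(\lambda)}$ for real-valued $f$. Indeed, for $N$ even, this implies that $S^k\omega_f(\lambda) \in X$ for any $k \in \mathbb{Z}_N$. As such, there is no hope for the family $\Omega_f$ to generate the whole $\mathbb{C}^N$.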

Finally, we have the following theorem.

Theorem 2 For any compact $K \subset \mathbb{R}^2$, if the mother wavelet $\Psi \in \mathcal{R}$, the rotational power-spectrum and bispectrum are weakly complete on $L^2(K) \cap \mathcal{A}$. Namely, the set $\mathcal{R}$ is open and dense in $L^2(K)$ and for any $f, g \in \mathcal{R} \cap \mathcal{A}$ it holds that $RPS_{\mathcal{L}_c f} = RPS_{\mathcal{L}_c g}$ and $RBS_{\mathcal{L}_c f} = RBS_{\mathcal{L}_c g}$ if and only if $f = \pi(x, k)g$ for some $(x, k) \in SE(2,N)$.

Here, we content ourselves with presenting only a sketch of the proof of this result for the case $N$ odd. The case $N$ even does not introduce essential problems: it suffices to exploit the fact that $\operatorname{range} \widehat{\mathcal{L}f}(T^\lambda) \subset X$ for all $f \in \mathcal{R}$ and $\lambda \in \mathcal{S}$ and that the equivalence $A$ of the Induction-Reduction Theorem quotients nicely to an equivalence between $X \otimes X$ and $\bigoplus_{k \in \mathbb{Z}_N} X$. However, in order to prove the key technical point (14) we need a much finer study of the properties of circulant operators, which is outside the scope of this work and which we defer to a forthcoming paper by Prandi and Gauthier.

⁵ Recall that the average of $f : \mathbb{R}^2 \to \mathbb{R}$ is $\operatorname{avg} f = \int_{\mathbb{R}^2} f(x)\, dx$, which is always well-defined for $L^2(\mathbb{R}^2)$ functions with compact support.


Proof (Sketch in the case $N$ odd) The fact that $\mathcal{R}$ is open and dense in $L^2(K)$ follows from the same arguments as in Lemma 1.

Let $\operatorname{Circ} v$ be the circulant matrix associated with $v$, that is, $\operatorname{Circ} v = [v, Sv, \ldots, S^{N-1}v]$. Then the condition on $\Omega_f$ for $f \in \mathcal{R}$ is equivalent to the invertibility of $\operatorname{Circ} \omega_f(\lambda)$ for an open and dense set of $\lambda$'s. By the properties of the Fourier transform on $\mathbb{R}^2$ w.r.t. translations it follows that

$$\omega_{\Phi f}(\lambda) = \operatorname{diag}_k\big(e^{i\langle c_f, R_{-k}\lambda\rangle}\big)\, \omega_f(\lambda).$$

This entails that $\operatorname{Circ} \omega_f(\lambda)$ is invertible if and only if $\operatorname{Circ} \omega_{\Phi f}(\lambda)$ is. Hence, the statement is equivalent to the fact that for any pair $f, g \in \mathcal{R}$ we have $RBS_{\mathcal{L}f} = RBS_{\mathcal{L}g}$ if and only if $f = R_k g$ for some $k \in \mathbb{Z}_N$.

The proof is similar to the one of Theorem 1, but with additional technical difficulties. Let $I$ be the set where $\operatorname{Circ} \omega_f(\lambda)$ and $\operatorname{Circ} \omega_g(\lambda)$ are invertible. By assumption $I$ is open and dense. To overcome the non-invertibility of $\widehat{\mathcal{L}f}$ in the definition of the candidate intertwining representation $U$, we exploit the invertibility of the circulant matrices $\operatorname{Circ} \omega_f(\lambda)$ and $\operatorname{Circ} \omega_g(\lambda)$ on an open and dense set. Namely, for any $\lambda \in I$ we let

$$U(T^\lambda)^* := \operatorname{Circ} \omega_g(\lambda)\, \big(\operatorname{Circ} \omega_f(\lambda)\big)^{-1}.$$

By definition, $U(T^\lambda)$ is circulant and $U(T^\lambda)^* S^k\omega_f(\lambda) = S^k\omega_g(\lambda)$ for any $k \in \mathbb{Z}_N$. Moreover, by (12), this is equivalent to

$$\widehat{\mathcal{L}f}(T^{R_k\lambda})\, U(T^\lambda) = \widehat{\mathcal{L}g}(T^{R_k\lambda}), \quad \forall k \in \mathbb{Z}_N.$$

In particular, $\lambda \mapsto U(T^\lambda)$ is constant on orbits $\{R_k\lambda\}_{k \in \mathbb{Z}_N}$. Finally, $U(T^\lambda)$ is unitary as a consequence, e.g., of Theorem 3.

The main difficulty in the proof is now to derive the equivalent of identity (11), that is, that for an open and dense set of pairs $(\lambda_1, \lambda_2)$ we have

$$\widehat{\mathcal{L}f}(T^{R_k\lambda_1} \otimes T^{\lambda_2}) \circ U(T^{\lambda_1}) \otimes U(T^{\lambda_2}) = \widehat{\mathcal{L}g}(T^{R_k\lambda_1} \otimes T^{\lambda_2}), \quad \forall k \in \mathbb{Z}_N. \qquad (14)$$

As already mentioned, the proof of this identity requires a deep use of properties of circulant operators, which is outside the scope of this paper. We thus defer it to a forthcoming paper.

Once (14) is known, the statement follows by applying the same arguments as those in Theorem 1. Namely,

1. The function $\lambda \mapsto U(T^\lambda)$ is continuous on $I$. This can be proved via the same arguments as in Lemma 2.

2. The function $\lambda \mapsto U(T^\lambda)$ can be extended to a continuous function on $\mathcal{S}$ still satisfying (14). This can be done exactly as in Lemma 3.
3. There exists $k \in \mathbb{Z}_N$ such that $U(T^\lambda) = T^\lambda(0, k)$. This is proved following Lemma 4. Indeed, the fact that now $\lambda \mapsto U(T^\lambda)$ is constant on the orbits $\{R_k\lambda\}_{k \in \mathbb{Z}_N}$ implies that the $\varphi_k$'s obtained there have to be independent of $k$. Since $\varphi_k(\lambda) = \ldots$ for some $x_0 \in \mathbb{R}^2$, this implies that $x_0 = 0$ and hence $\varphi_k \equiv 0$. Obviously, this proves that $U(T^\lambda) = S^k = T^\lambda(0, k)$, for some $k \in \mathbb{Z}_N$. □

4.3 Practical computation of the Fourier descriptors

Here, we present some explicit formulae for the computation of the Fourier descriptors presented in this section.

In the following, we show that, under some assumptions on the mother wavelet $\Psi$, the concrete computation of the generalized power-spectrum and bispectrum and of their rotational counterparts depends only on the 2D Fourier transform of $f$.

Theorem 3 Assume that the mother wavelet $\Psi \in \mathcal{R}$. Then:

— For any $f \in \mathcal{R}$, the generalized power-spectrum and bispectrum of $\mathcal{L}f$ are respectively determined by the quantities, for a.e. $\lambda, \lambda_1, \lambda_2 \in \mathcal{S}$,

$$I_1^{\lambda}(f) = \|\omega_f(\lambda)\|^2 = \sum_{k=0}^{N-1} |\hat f(R_{-k}\lambda)|^2,$$

$$I_2^{\lambda_1,\lambda_2}(f) = \langle \omega_f(\lambda_1) \odot \omega_f(\lambda_2),\ \omega_f(\lambda_1 + \lambda_2)\rangle = \sum_{k=0}^{N-1} \overline{\hat f(R_{-k}\lambda_1)\, \hat f(R_{-k}\lambda_2)}\ \hat f(R_{-k}(\lambda_1 + \lambda_2)).$$

— For any $f \in \mathcal{A} \cap \mathcal{R}$, the rotational power-spectrum and bispectrum of $\mathcal{L}_c f$ are respectively determined by the quantities, for a.e. $\lambda, \lambda_1, \lambda_2 \in \mathcal{S}$ and $h \in \mathbb{Z}_N$,

$$I_1^{\lambda,h}(f) = \langle \omega_{\Phi f}(R_h\lambda),\ \omega_{\Phi f}(\lambda)\rangle = \sum_{k=0}^{N-1} \overline{\widehat{\Phi f}(R_{-k+h}\lambda)}\ \widehat{\Phi f}(R_{-k}\lambda),$$

$$I_2^{\lambda_1,\lambda_2,h}(f) = \langle \omega_{\Phi f}(R_h\lambda_1) \odot \omega_{\Phi f}(\lambda_2),\ \omega_{\Phi f}(\lambda_1 + \lambda_2)\rangle = \sum_{k=0}^{N-1} \overline{\widehat{\Phi f}(R_{-k+h}\lambda_1)\, \widehat{\Phi f}(R_{-k}\lambda_2)}\ \widehat{\Phi f}(R_{-k}(\lambda_1 + \lambda_2)).$$

Here, $\Phi : \mathcal{A} \to \mathcal{A}$ is the centering operator defined in (13).

Remark 4 Theorem 3 shows in particular that the result of Theorem 2 is indeed stronger than the completeness result for the generalized bispectrum of the cyclic lift obtained in [38]. Indeed, in that work it is proved that the latter (for odd $N$) is determined exactly by the quantities, for a.e. $\lambda_1, \lambda_2 \in \mathcal{S}$ and $h, k \in \mathbb{Z}_N$,

$$\langle \omega_{\Phi f}(R_h\lambda_1) \odot \omega_{\Phi f}(R_k\lambda_2),\ \omega_{\Phi f}(\lambda_1 + R_{h+k}\lambda_2)\rangle.$$

In particular, for each $\lambda_1, \lambda_2 \in \mathcal{S}$ one has to compute $N$ times more quantities than those for the rotational bispectrum.

As a corollary of Theorem 3 we show that, in order to compare the power-spectra and bispectra, it is usually enough to compare only the latter.

Corollary 2 Let $\Psi \in \mathcal{R}$ and $f, g \in \mathcal{R} \cap \mathcal{A}$. Then, if $\mathcal{L}f$ and $\mathcal{L}g$ have the same generalized (resp. rotational) bispectrum, they also have the same generalized (resp. rotational) power-spectrum.

Proof We only prove the result for the rotational descriptors. In order to prove the one for the generalized descriptors, it will be enough to fix $h = 0$ in the following. By Theorem 3 it is enough to show that whenever $I_2^{\lambda_1,\lambda_2,h}(f) = I_2^{\lambda_1,\lambda_2,h}(g)$ for a.e. $\lambda_1, \lambda_2 \in \mathcal{S}$ and any $h \in \mathbb{Z}_N$, then $I_1^{\lambda,h}(f) = I_1^{\lambda,h}(g)$ for a.e. $\lambda \in \mathcal{S}$ and any $h \in \mathbb{Z}_N$. We start by observing that by the Paley-Wiener Theorem all these quantities are analytic, since $f$ and $g$ are compactly supported. Moreover,

$$\lim_{\lambda_1, \lambda_2 \to 0} I_2^{\lambda_1,\lambda_2,h}(f) = N \hat f(0)\, |\hat f(0)|^2 = N \operatorname{avg}(f)^3,$$

and the same is true for $g$. Thus, $\operatorname{avg}(f) = \operatorname{avg}(g)$. Finally, the result follows observing that

$$\lim_{\lambda_2 \to 0} I_2^{\lambda_1,\lambda_2,h}(f) = \operatorname{avg}(f)\, I_1^{\lambda_1,h}(f). \qquad □$$

5 Experimental results

The goal of this section is to evaluate the performance of the invariant Fourier descriptors defined in the previous section on a large image database for object recognition. In addition to the generalized power-spectrum (PS) and bispectrum (BS) and the rotational power-spectrum (RPS) and bispectrum (RBS), we also consider the combination of the RPS and BS descriptors. Indeed, combining these two descriptors seems to be a good compromise between the theoretical result of completeness given by Theorem 2, which only holds for the RBS, and computational demands, as the results on the COIL-100 database will show.

After showing how to efficiently compute these descriptors and presenting the image data set, we analyze some experimental results. In order to estimate the capabilities of the features, we use a support vector machine (SVM) [39] as supervised classification method. The recognition performances of the different descriptors regarding invariance to rotation, discrimination capability and robustness against noise are compared.

5.1 Implementation 

As proved in Theorem 3, the equality of the Fourier descriptors we introduced does not depend on the choice of the mother wavelet $\Psi$. Accordingly, in our implementation we only computed the quantities introduced in Theorem 3, whose complexity is reduced to the efficient computation of the vector $\omega_f(\lambda)$, for a given $\lambda \in \mathcal{S}$. We recall that this vector is obtained by evaluating the Fourier transform of $f$ on the orbit of $\lambda$ under the action of the discrete rotations $R_{-k}$ for $k \in \mathbb{Z}_N$.
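As an illustration, the four quantities of Theorem 3 can be sketched directly on the orbit vectors. The sketch below is not the authors' MATLAB code: the function names are hypothetical, and it assumes that each orbit vector is stored as a list whose $k$-th entry is the Fourier transform of the (already centered) image at $R_{-k}\lambda$, so that $\omega_f(R_h\lambda)$ is a cyclic shift of $\omega_f(\lambda)$ by $h$ positions.

```python
def rotate_orbit(w, h):
    """Return omega_f(R_h lambda) from omega_f(lambda): entry k becomes w[k - h] (cyclic)."""
    n = len(w)
    return [w[(k - h) % n] for k in range(n)]


def power_spectrum(w):
    """Generalized power-spectrum quantity: squared norm of the orbit vector."""
    return sum(abs(z) ** 2 for z in w)


def bispectrum(w1, w2, w12):
    """Generalized bispectrum quantity.

    w1, w2 and w12 are the orbit vectors at lambda_1, lambda_2 and lambda_1 + lambda_2.
    """
    return sum(a * b * c.conjugate() for a, b, c in zip(w1, w2, w12))


def rotational_power_spectrum(w, h):
    """Rotational power-spectrum quantity for the shift h."""
    return sum(a * b.conjugate() for a, b in zip(rotate_orbit(w, h), w))


def rotational_bispectrum(w1, w2, w12, h):
    """Rotational bispectrum quantity for the shift h."""
    return bispectrum(rotate_orbit(w1, h), w2, w12)
```

Under the storage convention assumed here, rotating the image by a multiple of $2\pi/N$ shifts every orbit vector cyclically by the same amount, which leaves all four sums unchanged; for $h = 0$ the rotational quantities reduce to the generalized ones.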

Let us remark that, although in our implementation we chose this approach, in principle fixing a specific mother wavelet could be useful to appropriately weight descriptors depending on the associated frequencies. Indeed, preliminary tests with a Gabor mother wavelet (which can be easily shown to be in $\mathcal{R}$) showed slightly better results at a higher computational cost.

For the implementation we chose to consider N = 6 
and to work with images composed of hexagonal pixels. 
There are two reasons for this choice: 

— It is well-known that retinal cells are distributed in a 
hexagonal grid, and thus it is reasonable to assume 
that cortical activations reflect this fact. 

— Hexagonal grids are invariant under the action of $\mathbb{Z}_6$ and discretized translations, which is the most we can get in the line of the invariance w.r.t. $SE(2, 6)$. Indeed, apart from the hexagonal lattice, the only other lattices on $\mathbb{R}^2$ which are invariant by some $\mathbb{Z}_N$ and appropriate discrete translations are obtained with $N = 2, 3, 4$.
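The invariance of the hexagonal lattice under rotations by 60° can be checked numerically. The following is a minimal sketch, assuming the lattice basis $e_1 = (1, 0)$, $e_2 = (1/2, \sqrt{3}/2)$; the basis and the function names are illustrative choices, not taken from the paper.

```python
import math


def hex_point(a, b):
    """Cartesian coordinates of the lattice point a*e1 + b*e2."""
    return (a + 0.5 * b, math.sqrt(3) / 2 * b)


def rotate(p, angle):
    """Rotate the point p counter-clockwise by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


def rotate60_coords(a, b):
    """Integer lattice coordinates after a 60-degree rotation (e1 -> e2, e2 -> e2 - e1)."""
    return (-b, a + b)
```

Since the rotation sends integer coordinates $(a, b)$ to the integer coordinates $(-b, a + b)$, every lattice point is mapped to a lattice point. The square lattice, by contrast, is not preserved by the same rotation.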

The different steps of the computation of the descriptors* are described in Figure 1 and are as follows:

1. The input image is converted to grayscale, its Fourier transform is computed via FFT, and the zero-frequency component is shifted to the center of the spectrum (Fig. 1, S1).

2. For reasons of computational cost, and since we are dealing with natural images, for which the relevant frequencies are the low ones, we extract a grid of 16 × 16 pixels around the origin (Fig. 1, S2).

3. The invariants of Theorem 3 are computed from the shifted Fourier transform values, on all frequencies in a hexagonal grid inside this 16 × 16 pixel square. A bilinear interpolation is applied to obtain the correct values of $\omega_f(\lambda)$ (Fig. 1, S3, S4, S5, S6). The final dimension of the feature vector is given in Table 1.

* MATLAB sample code for the implementation of the rotational bispectral invariants can be found at https://nbviewer.jupyter.org/github/dprn/bispectral-invariant-svm/blob/master/Invariant_computation_matlab.ipynb


Table 1 Dimension of the feature vectors for the Fourier descriptors under consideration

Descr.      Dim.
PS          136
BS          717
RPS         816
RBS         4417
RPS + BS    1533
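The sampling in step 3 can be sketched as follows. This is an illustrative sketch only: it assumes the centered 16 × 16 spectrum is already available as a list of rows (for instance from any FFT routine followed by a shift), that frequencies are measured in pixels from the central entry, and that the sampled points stay inside the grid. The names `bilinear` and `orbit_vector` are hypothetical.

```python
import math


def bilinear(grid, x, y):
    """Bilinearly interpolate a 2D grid (list of rows) at real column x and row y."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(grid[0]) - 1)
    y1 = min(y0 + 1, len(grid) - 1)
    tx, ty = x - x0, y - y0
    top = (1 - tx) * grid[y0][x0] + tx * grid[y0][x1]
    bottom = (1 - tx) * grid[y1][x0] + tx * grid[y1][x1]
    return (1 - ty) * top + ty * bottom


def orbit_vector(spectrum, lam, n_rot=6):
    """Sample a centered spectrum at R_{-k} lam for k = 0, ..., n_rot - 1."""
    cy, cx = len(spectrum) // 2, len(spectrum[0]) // 2
    values = []
    for k in range(n_rot):
        angle = -2 * math.pi * k / n_rot
        u = math.cos(angle) * lam[0] - math.sin(angle) * lam[1]
        v = math.sin(angle) * lam[0] + math.cos(angle) * lam[1]
        values.append(bilinear(spectrum, cx + u, cy + v))
    return values
```

For `n_rot = 6` the six sampled points are the vertices of a hexagon centered at the zero frequency, which generally do not fall on pixel centers; this is why an interpolation is needed.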


5.2 Test protocol 

We use the Fourier descriptors to feed an SVM classifier, via the MATLAB Statistics and Machine Learning Toolbox, applying it to a database of 7200 images extracted from the Columbia Object Image Library (COIL-100) and a database of 400 faces extracted from the ORL face database. Finally, we compare the results with those obtained using traditional descriptors.

The result of the training step consists of the set of 
support vectors determined by the SVM based method. 
During the decision step, the classifier computes the 
Fourier descriptors and the model determined during 
the training step is used to perform the SVM decision. 
The output is the image class. 

For the COIL-100 database, two cases are studied: one without noise and one with noise. In the first, tests have been performed using 75% of the COIL-100 database images for training and 25% for testing. In the second, we have used a learning data set composed of all the 7200 images (100 objects with 72 views) without noise and a testing data set composed of 15 randomly selected views per object, to which additive Gaussian noise with standard deviation (Sd) of 5, 10 and 20 was added (see Fig. 4).
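The noise model of the second case can be sketched as below. The clipping to the range [0, 255] is an assumption (8-bit grey levels) that the text does not state, and the function name is hypothetical.

```python
import random


def add_gaussian_noise(pixels, sd, seed=None):
    """Add zero-mean Gaussian noise of standard deviation sd to a flat list of grey levels.

    Values are clipped to [0, 255]; this clipping is an assumption, not stated in the text.
    """
    rng = random.Random(seed)
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sd))) for p in pixels]
```

A call such as `add_gaussian_noise(image, 20, seed=0)` would correspond to the strongest noise level considered, under the assumptions above.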

We evaluate separately the recognition rate obtained using the four previous invariant descriptors and the combination of the RPS & BS invariants to test their complementarity. Then, we compare their performance with Hu's moments (HM), Zernike's moments (ZM), and the Fourier-Mellin transform (FM), described in Appendix A, and with the local SIFT and HOG descriptors [12], whose performance under the same conditions has been tested in [7].

Fig. 1 Steps of computing the invariant descriptors. (S1) computation of the shifted FFT of the image $f$, (S2) generation of the hexagonal grid, (S3) extraction of different hexagons, (S4) evaluation of the FFT of $f$ on each extracted hexagon, (S5) generation of the vector $\omega_f(\lambda)$ and (S6) computation of the four invariants

Since we use the RBF kernel in the SVM classification process, the classification depends on the kernel size $\sigma$. The results presented here are obtained by choosing empirically the value $\sigma_{\mathrm{opt}}$ that provided the maximum recognition rate.

5.3 Experiments 

The performances of the different invariant descriptors are analyzed with respect to the recognition rate for a given learning set. For a given ratio, the learning and testing sets have been built by randomly splitting all examples. Because of the randomness of this procedure, multiple trials have been performed with different random draws of the learning and testing sets. In the case of added noise, since, as mentioned before, the learning set comprises all images, this procedure is applied only to the testing set.
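The random splitting and the averaging over trials can be sketched as follows. The callable `evaluate` is a hypothetical placeholder that would train a classifier on the learning indices and return the recognition rate on the testing indices; it is not part of the paper.

```python
import random


def random_split(n_items, train_ratio, rng):
    """Randomly split the indices 0..n_items-1 into learning and testing sets."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    cut = int(round(train_ratio * n_items))
    return indices[:cut], indices[cut:]


def mean_recognition_rate(n_items, train_ratio, evaluate, n_trials=5, seed=0):
    """Average the rate returned by evaluate(train_idx, test_idx) over random trials."""
    rng = random.Random(seed)
    rates = [evaluate(*random_split(n_items, train_ratio, rng)) for _ in range(n_trials)]
    return sum(rates) / n_trials
```

With 7200 images and a ratio of 0.75, each draw yields 5400 learning images and 1800 testing images.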

The parameters of our experiments are the follow¬ 
ing: 

1. The learning set Ci corresponding to the values of 
an invariant descriptor computed on an image from 
the database; 

2. The classes $C_i \in \{1, \dots, 100\}$ corresponding to the object class;

3. Algorithm performance: the efficiency is given as the percentage of correctly recognized objects in the testing set.

4. Number of random trials: fixed to 5. 

5. Kernel $K$: a Gaussian kernel of bandwidth $\sigma$ is chosen,

$$K(x, y) = e^{-\frac{\|x - y\|^2}{2\sigma^2}},$$

where $x$ and $y$ correspond to the descriptor vectors of objects.
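For concreteness, a Gaussian kernel between two descriptor vectors can be sketched as below. The normalization $2\sigma^2$ in the exponent is the common convention and is assumed here; other conventions differ only by a rescaling of $\sigma$.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two descriptor vectors of equal length.

    Assumes the exponent -||x - y||^2 / (2 * sigma^2).
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

The kernel equals 1 for identical descriptors and decays with their Euclidean distance, at a rate controlled by $\sigma$.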

For solving a multi-class problem, the two most pop¬ 
ular approaches are the one-against-all (OAA) method 
and the one-against-one (OAO) method [29]. For our 
purpose, we chose an OAO SVM because it is substan¬ 
tially faster to train and seems preferable for problems 
with a very large number of classes. 
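The one-against-one scheme can be sketched as a majority vote over all pairs of classes. Here `pairwise_decide(a, b, x)` is a hypothetical stand-in for the binary SVM trained on classes `a` and `b`; it returns the class it prefers for the sample `x`.

```python
from collections import Counter
from itertools import combinations


def one_against_one_predict(classes, pairwise_decide, x):
    """Predict the class of x by majority vote over all pairwise binary decisions."""
    votes = Counter(pairwise_decide(a, b, x) for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


def n_binary_classifiers(n_classes):
    """Number of binary classifiers needed by the OAO and the OAA schemes."""
    return n_classes * (n_classes - 1) // 2, n_classes
```

For the 100 classes of COIL-100 this gives 4950 binary classifiers for OAO against 100 for OAA. Each OAO classifier, however, is trained only on the images of two classes, which is one plausible reason for the shorter training time mentioned above.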

5.3.1 The COIL-100 database

The Columbia Object Image Library (COIL-100, Fig. 
2) is a database of color images of 100 different objects, 
where 72 images of each object were taken at pose in¬ 
tervals of 5°. 

Classification performance 

Table 2 presents the results obtained by testing our object recognition method on the COIL-100 database. The best results were achieved using the local SIFT descriptor. The RBS comes in second place and the local HOG features come third. Indeed, as has been demonstrated in the literature, these local methods currently give the best results. However, if noise is added to the image, global approaches perform better than local ones. The main reason is that the key-point detector used in the local methods produces, in these cases, many key-points that are not relevant for object recognition. This will be shown in the next subsection.

In Figure 3 we present the recognition rate as a function of the size of the training set. As expected, this is an increasing function and we remark that the RBS and the combination of the RPS and the BS give better results than the other global invariant descriptors.

Fig. 2 Sample objects of COIL-100 database


Table 2 Recognition rate for each descriptor using the COIL-100 database. The test results for ZM, HM, FM, and SIFT are taken from [7].

Descriptors   Recognition rates
RBS           95.5%
BS            88%
PS            84.3%
RPS           89.8%
RPS + BS      92.8%
ZM            91.9%
HM            80.2%
FM            89.6%
HOG           95.3%
SIFT          100%


Fig. 3 Classification rate for different size of the training database (curves: RBS, BS, PS, RPS, RPS + BS, ZM, HM, FM). The test results for ZM, HM, FM, and SIFT are taken from [7].


Robustness against noise

Also in this case, test results for ZM, HM, FM, and SIFT are taken from [7].

Results presented in Table 3 show that noise has little influence on classification performance when we use a global descriptor such as RBS, BS, the combination of BS & RPS, ZM, HM and FM. It has however a noticeable effect on the SIFT local descriptor, and a big one on the HOG local descriptor.

Fig. 4 Sample of COIL-100 noisy object (std = 5, std = 10, std = 20)

5.3.2 The ORL database 

The Cambridge University ORL face database (Fig. 5) is composed of 400 grey-level images: ten different images for each of 40 persons. The images vary across time, size, pose, facial expression (open/closed eyes, smiling/not smiling), and facial details (glasses/no glasses).



Fig. 5 Face samples from the ORL database 


In the literature, the protocol used for training and testing differs from one paper to another. In [36], a hidden Markov model (HMM) based approach is used, and the best model resulted in a recognition rate of 95%, with high computational cost. In [21], Hjelmas reached an 85% recognition rate using the ORL database and a feature vector consisting of Gabor coefficients.

We perform experiments on the ORL database using the RBS, BS, PS, RPS, ZM, HM, FM, and the combination of the RPS & BS descriptors. Since the local descriptors SIFT and HOG obtained, predictably, almost perfect scores, we do not present them. The results are shown in Table 4, where we clearly see that the RBS invariant descriptor gives the best recognition rate, 89.8%, faring far better than before w.r.t. the combination of RPS and BS descriptors.

Table 3 Classification rate on COIL-100 noisy database. The test results for ZM, HM, FM, and SIFT are taken from [7].

Sd   RBS    BS     PS      RPS     RPS + BS   ZM     HM      FM      SIFT     HOG
5    100%   100%   71.5%   99.8%   100%       100%   95.2%   98.6%   89.27%   4%
10   100%   100%   71.2%   99.8%   100%       100%   95.2%   95.2%   88.89%   1.2%
20   100%   100%   67.8%   99.8%   100%       100%   91.4%   90.2%   85.46%   1%


Table 4 Recognition rate for each descriptor using the ORL database

Descriptors   Recognition rates
RBS           89.8%
BS            67.9%
PS            49.2%
RPS           76.9%
RPS + BS      79.8%
ZM            75%
HM            43.5%
FM            47.6%


6 Conclusion and perspectives

In this paper we presented four Fourier descriptors over the semidiscrete roto-translation group $SE(2, N)$. Then, we proved that the generalized power-spectrum (PS) and bispectrum (BS) - and thus the rotational power-spectrum (RPS) and bispectrum (RBS) - are weakly complete, in the sense that they allow one to distinguish functions over an open and dense set of compactly supported functions $\varphi \in L^2(SE(2, N))$ up to the $SE(2, N)$ action. This generalizes a result of [38]. We then considered a framework for the application of these Fourier descriptors to roto-translation invariant object recognition, inspired by some neurophysiological facts on the human primary visual cortex. In this framework, we showed that the rotational bispectrum is indeed a weakly complete roto-translation invariant for planar images. Moreover, although the proposed Fourier descriptors are given in terms of complex mathematical objects, we showed that they can be implemented in a straightforward way as linear combinations of the values of the 2D Fourier transform of the image.

In the second part of the paper, we proposed an evaluation of the performances of these Fourier descriptors in object recognition and we presented the results obtained on different databases: the COIL-100 database, composed of several objects undergoing 3D rotation and scale changes, and the ORL database, in which different human faces are subject to several kinds of variations. For both these databases, the global Fourier descriptors introduced in this paper are the most efficient global descriptors tested, equalled only, for noisy images, by the Zernike moments. Although for unperturbed images the local SIFT descriptor gives a better recognition rate, the addition of noise leads to the global descriptors outperforming the local ones. These results thus show the rotational bispectrum (RBS) to be a very good Fourier descriptor for object recognition, consistently with the theoretical weak completeness result. When the dimension of the feature vector is an issue, the RBS can be replaced by a combination of the generalized bispectrum (BS) and the rotational power-spectrum (RPS), which yields slightly worse results with a feature vector of length almost one third.

An extension of the object recognition method presented in this paper to an AdaBoost framework for the problem of object detection is currently under way.


A Moment invariants and Fourier-Mellin transform


In this section, following [7], we review the two most used classes of moment invariants, Hu and Zernike, and the Fourier-Mellin descriptors, which we use as a comparison for our generalized Fourier descriptors.

Moment invariants were first introduced to the pattern recognition and image processing community in 1962 by Hu [22], with the introduction of the seven Hu moments, which are invariant under translation, rotation and scaling. These are derived from a scaling and translation invariant modification of the standard moments of an image $I : \mathbb{R}^2 \to \mathbb{R}$. Namely,


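$$\eta_{p,q} = \frac{\mu_{p,q}}{\mu_{0,0}^{1 + \frac{p+q}{2}}},$$

where

$$\mu_{p,q} = \int (x - x_0)^p (y - y_0)^q I(x, y)\,dx\,dy,$$

and $x_0 = M_{1,0}/M_{0,0}$ and $y_0 = M_{0,1}/M_{0,0}$ are the coordinates of the barycenter computed via the standard $(p + q)$-th order moments of $I$:

$$M_{p,q} = \int x^p y^q I(x, y)\,dx\,dy.$$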

Another important class of moments are the Zernike ones, introduced in [9] and computed via orthogonal Zernike polynomials. The Zernike moment of order $(m, n)$ is:

$$Z_{mn} = \frac{m+1}{\pi} \sum_x \sum_y I(x, y)\,[V_{mn}(x, y)]^*,$$

where $x^2 + y^2 \le 1$ and $V_{mn}(x, y)$ are the Zernike polynomials defined in polar coordinates as $V_{mn}(r, \theta) = R_{mn}(r)\, e^{in\theta}$, where

$$R_{mn}(r) = \sum_{s=0}^{\frac{m-|n|}{2}} \frac{(-1)^s (m-s)!\; r^{m-2s}}{s!\,\left(\frac{m+|n|}{2} - s\right)!\,\left(\frac{m-|n|}{2} - s\right)!}.$$

These moments present several advantages. Indeed, besides rotation and translation invariance, they have nice orthogonality properties and are considered to be robust against image noise. In particular, the orthogonality property helps in achieving a near-zero value of the redundancy measure in a set of moment functions [8].
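The radial part of the Zernike polynomials can be evaluated directly from the standard factorial formula. The following is a minimal sketch with a hypothetical function name; it requires $|n| \le m$ with $m - |n|$ even.

```python
from math import factorial


def zernike_radial(m, n, r):
    """Radial Zernike polynomial R_mn(r), for |n| <= m and m - |n| even."""
    n = abs(n)
    if n > m or (m - n) % 2:
        raise ValueError("need |n| <= m and m - |n| even")
    return sum(
        (-1) ** s * factorial(m - s)
        / (factorial(s) * factorial((m + n) // 2 - s) * factorial((m - n) // 2 - s))
        * r ** (m - 2 * s)
        for s in range((m - n) // 2 + 1)
    )
```

For example, the formula gives $R_{20}(r) = 2r^2 - 1$ and $R_{40}(r) = 6r^4 - 6r^2 + 1$, and every radial polynomial takes the value 1 at $r = 1$.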

Finally, strictly related to Fourier descriptors are the descriptors obtained via the Fourier-Mellin transform (FMT), presented in [13]. The FMT of an image $I$, that we assume to be given in polar coordinates, is defined as:

$$M_I(u, v) = \frac{1}{2\pi} \int_0^{2\pi} \int_0^{\infty} I(r, \theta)\, r^{-iv}\, e^{-iu\theta}\, \frac{dr}{r}\, d\theta.$$

Following [13], we will indeed compute the analytical Fourier-Mellin transform (AFMT). That is, we replace $I$ in the above definition with its regularized version $I_\sigma(r, \theta) = r^{\sigma} I(r, \theta)$, where $\sigma > 0$. Finally, each feature $M_{I_\sigma}(u, v)$ is modified in order to compensate for the rotation, translation and size changes of the object.


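B Auxiliary lemmata for the proof of Theorem 1

Lemma 1 The set $\mathcal{G}$ introduced in Theorem 1 is open and dense in $L^2(K \times \mathbb{Z}_N)$.

Proof We start by showing that $\mathcal{G} \neq \emptyset$. To this aim, it suffices to consider $\varphi$ such that $\varphi(\cdot, k) = 0$ for all $k \in \mathbb{Z}_N \setminus \{0\}$ and $\varphi(\cdot, 0) \neq 0$ such that $\operatorname{supp} \mathcal{F}(\varphi(\cdot, 0)) = \mathbb{R}^2$. By (6), we then have $\varphi \in \mathcal{G}$, since

$$\det \hat\varphi(T^\lambda) = \prod_{k} \mathcal{F}(\varphi(\cdot, 0))(R_{-k}\lambda) \qquad \forall \lambda \in \mathcal{S}.$$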


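Proof Fix $\lambda_0 \in I$ and an open set $V \subset \mathbb{R}^2$ such that

$$\int_V U(T^{\lambda_2})_{i,j}\, d\lambda_2 \neq 0 \qquad \forall i, j \in \mathbb{Z}_N.$$

This is possible since $U \neq 0$. Since the set $I$ is open and dense, up to reducing $V$ we can assume that there exists a neighborhood $W$ of $\lambda_0$ such that $V + \lambda \subset I$ for any $\lambda \in W$. Then, (15) holds for $\lambda_1 \in W$ and $\lambda_2 \in V$. Explicitly computing the $0, 0$ block of (15), we have

$$U(T^{\lambda_1})_{i,j}\, U(T^{\lambda_2})_{i,j} = U(T^{\lambda_1 + \lambda_2})_{i,j} \qquad \forall i, j \in \mathbb{Z}_N.$$

Then, integrating it over $V$ w.r.t. $\lambda_2$ yields

$$U(T^{\lambda_1})_{i,j} = \frac{\int_{V + \lambda_1} U(T^{\lambda_2})_{i,j}\, d\lambda_2}{\int_V U(T^{\lambda_2})_{i,j}\, d\lambda_2} \qquad \forall \lambda_1 \in W,\ \forall i, j \in \mathbb{Z}_N.$$

Since the function on the r.h.s. is clearly continuous on $W$, this proves the continuity at $\lambda_0$ of $\lambda \mapsto U(T^\lambda)$, completing the proof. □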


Lemma 3 The function $\lambda \mapsto U(T^\lambda)$ can be extended to a continuous function on $\mathbb{R}^2$ for which (11) is still true.


Proof Let $\lambda_0 \notin I$. Since $I$ is an open and dense set, this implies that $\lambda_0$ is in its closure and that we can choose $\lambda_1, \lambda_2 \in I$ such that $\lambda_0 = \lambda_1 + R_{k_0}\lambda_2$ for some $k_0 \in \mathbb{Z}_N$ and $\lambda_1 + R_k\lambda_2 \in I$ for any $k \neq k_0$. We then let

$$U(T^{\lambda_0}) := \left(A \circ U(T^{\lambda_1}) \otimes U(T^{\lambda_2}) \circ A^*\right)^{k_0, k_0}. \qquad (16)$$

We now prove that the above definition does not depend on the choice of $\lambda_1$, $\lambda_2$ and $k_0$. By openness of $I$, there exists a neighborhood $V$ of $\lambda_2$ entirely contained in $I$. Then, up to taking a smaller $V$, it holds that $\lambda_1 + R_{k_0}\lambda_2' \in I$ for any $\lambda_2' \in V \setminus \{\lambda_2\}$. By (15), this implies that for any $\mu_1 + R_{\ell_0}\mu_2 = \lambda_0$ it holds


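For any $\varphi \in \mathcal{G}$ and $k \in \mathbb{Z}_N$, the Paley-Wiener Theorem implies that $\mathcal{F}(\varphi(\cdot, k))$ is analytic. In particular, by (6), $\lambda \mapsto \det \hat\varphi(T^\lambda)$ is analytic. Thus, $\varphi \in \mathcal{G}$ if and only if $\hat\varphi(T^{\lambda_0})$ is invertible for some $\lambda_0 \in \mathcal{S}$.

We claim that the set $\mathcal{G}$ is dense. Indeed, let $\varphi \notin \mathcal{G}$ and fix some $\eta \in \mathcal{G}$ and $\lambda_0 \in \mathcal{S}$ such that $\hat\eta(T^{\lambda_0})$ is invertible. By analyticity of $\varepsilon \mapsto \det(\hat\varphi(T^{\lambda_0}) + \varepsilon\hat\eta(T^{\lambda_0}))$ it follows that $\varphi + \varepsilon\eta \in \mathcal{G}$ for sufficiently small $\varepsilon > 0$, which entails that $\varphi \in \overline{\mathcal{G}}$, proving the claim.

Let us prove that $\mathcal{G}$ is open in $L^2(K \times \mathbb{Z}_N)$. To this aim, fix $\varphi \in \mathcal{G}$ and $\varphi_n \to \varphi$ in $L^2(K \times \mathbb{Z}_N)$. This implies that $\hat\varphi_n \to \hat\varphi$ in $L^2(\widehat{SE(2, N)})$, and in particular that $\hat\varphi_n \to \hat\varphi$ in measure. By definition of convergence in measure, this implies that for sufficiently big $n$ it has to hold $\det(\hat\varphi_n(T^{\lambda_0})) \neq 0$. Hence $\varphi_n \in \mathcal{G}$ for $n$ sufficiently big and $\mathcal{G}$ is open. □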


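$$\left(A \circ U(T^{\lambda_1}) \otimes U(T^{\lambda_2'}) \circ A^*\right)^{k_0, k_0} = \left(A \circ U(T^{\mu_1}) \otimes U(T^{\mu_2'}) \circ A^*\right)^{\ell_0, \ell_0},$$

for $\lambda_2'$ and $\mu_2'$ sufficiently near, but different, to $\lambda_2$ and $\mu_2$, respectively. By the continuity of $U$ on $I$, proved in Lemma 2, this implies that this equation has to hold also for $\lambda_2' = \lambda_2$ and $\mu_2' = \mu_2$. Hence, (16) does not depend on the choice of $\lambda_1$, $\lambda_2$ and $k_0$.

Finally, the fact that $\hat f(T^{\lambda_1} \otimes T^{\lambda_2}) \circ U(T^{\lambda_1}) \otimes U(T^{\lambda_2}) = \hat g(T^{\lambda_1} \otimes T^{\lambda_2})$ for any $\lambda_1, \lambda_2$ follows from (16) and (15). □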

Lemma 4 There exists $a \in SE(2, N)$ such that $U(T^\lambda) = T^\lambda(a)$.

Proof By definition of U it holds that 


Before diving into the proofs of the other auxiliary lemmata, we make the following observation. Let $\lambda_1, \lambda_2 \in I$ be such that $\lambda_1 + R_k\lambda_2 \in I$ for all $k \in \mathbb{Z}_N$. Applying the Induction-Reduction theorem (9) to (11) yields


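$$\bigoplus_{k \in \mathbb{Z}_N} U(T^{\lambda_1 + R_k\lambda_2}) \circ A = A \circ U(T^{\lambda_1}) \otimes U(T^{\lambda_2}) \qquad \forall \lambda_1, \lambda_2 \neq 0.$$

Then, for any $i, j, \ell, k$,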


A o U(T^^) 0 t/(T^ 2 ) o 
= © + -^2 ) — + -^2 ) _ 0 


k G 


k G 


(15) 


Lemma 2 The function $\lambda \mapsto U(T^\lambda)$ is continuous on $I$.


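$$U(T^{\lambda_1})_{\ell,i}\, U(T^{\lambda_2})_{\ell-k,j} = \begin{cases} U(T^{\lambda_1 + R_k\lambda_2})_{\ell,i} & \text{if } j = i - k, \\ 0 & \text{otherwise.} \end{cases} \qquad (17)$$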


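By invertibility of $U(T^{\lambda_1})$, there exists $i_0 \in \mathbb{Z}_N$ such that $U(T^{\lambda_1})_{0,i_0} \neq 0$. Using (17) this implies that $U(T^{\lambda_2})_{-k,j} = 0$ for any $j \neq i_0 - k$. Namely, we have proved that there exists a family of functions $\varphi_k : \mathcal{S} \to \mathbb{C}$ such that $U(T^\lambda)_{k,j} = \varphi_k(\lambda)\,\delta_{j,\,k+i_0}$ or, equivalently, that

$$U(T^\lambda) = \operatorname{diag}_k \varphi_k(\lambda)\, S^{i_0}.$$

By the explicit expression (4) of $T^\lambda$, in order to complete the proof it suffices to prove that $\varphi_k(\lambda) = e^{i\langle R_k x_0, \lambda\rangle}$ for some $x_0 \in \mathbb{R}^2$.

By continuity and unitarity of $U$, the $\varphi_k$'s are continuous and satisfy $|\varphi_k(\lambda)| = 1$. Using again (17) with $j = i_0 - k$, we obtain

$$\varphi_\ell(\lambda_1 + R_k\lambda_2) = \varphi_\ell(\lambda_1)\,\varphi_{\ell-k}(\lambda_2), \qquad (18)$$

for any $\lambda_1, \lambda_2 \neq 0$ and $\ell, k \in \mathbb{Z}_N$.

We claim that the $\varphi_\ell$'s are characters of $\mathbb{R}^2$. Indeed, let us fix $k = 0$ in (18):

$$\varphi_\ell(\lambda_1 + \lambda_2) = \varphi_\ell(\lambda_1)\,\varphi_\ell(\lambda_2). \qquad (19)$$

Choosing $\lambda_2 = -\lambda_1$ in the above shows that $\varphi_\ell$ can be extended at 0. Moreover, letting $\lambda_1 = 0$ and taking the limit $\lambda_2 \to 0$ shows that this extension is continuous. Since characters of $\mathbb{R}^2$ are exactly the continuous functions satisfying (19), the claim is proved.

By Pontryagin duality, there exists $x_\ell \in \mathbb{R}^2$ such that $\varphi_\ell(\lambda) = e^{i\langle x_\ell, \lambda\rangle}$. Finally, by (18) with $k \in \mathbb{Z}_N$ one obtains that $R_{-k}x_\ell = x_{\ell-k}$, which proves that there exists $x_0 \in \mathbb{R}^2$ such that $\varphi_\ell(\lambda) = e^{i\langle R_\ell x_0, \lambda\rangle}$. This completes the proof of the statement. □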


C Proofs 

Proof (Formula (12)) Let $\lambda \in \mathcal{S}$ and consider $v \in \mathbb{C}^N$. Observe that $(x, k)^{-1} = (-R_{-k}x, -k)$. Then, by (5), (2), and (4), for any $h \in \mathbb{Z}_N$ we have

(£{T^)-v)h 

N-l 

= JC-{x,k)e-^<^’^'‘-’‘^KH-kdx 

k = 0 

^ f‘ f _ 

= '^ Vh-h / / f(y)'F{R-h(y - dydx 

k=o 
iV-l 

= y^vn-k / <Piz)f{y)e-^<^'‘->‘^’y--> dydz 

k = 0 

AT-l 

= <P{Rh£^) ^ Vh-k f(Rk-h^) 

k = 0 

By definition of $\omega_\Psi(\lambda)^* \otimes \omega_f(\lambda)^*$ this completes the proof. □

In order to prove Theorem 3 we need the following explicit 
description of the equivalence in the Induction-Reduction the¬ 
orem of Section 3.1. 

Lemma 5 For any $M, N \in \mathbb{C}^{N \times N}$ we have


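Proof Observe that for any $v \in \mathbb{C}^{N \times N}$ it holds $(M \otimes N).v = M \circ v \circ N^T$. Thus,

$$[A \circ (M \otimes N).v]_k(i) = \sum_{j,\ell=0}^{N-1} M_{i,j}\, N_{i-k,\ell}\, v(j, \ell).$$

Since it is straightforward to check that $A^{-1} : \bigoplus_{k \in \mathbb{Z}_N} \mathbb{C}^N \to \mathbb{C}^{N \times N}$ is given by $[A^{-1}(w_\ell)_{\ell \in \mathbb{Z}_N}](k, h) = w_{k-h}(k)$, we then have

$$[A \circ (M \otimes N) \circ A^{-1}.(w_h)_{h \in \mathbb{Z}_N}]_k(i) = \sum_{j,\ell=0}^{N-1} M_{i,j}\, N_{i-k,\ell}\, w_{j-\ell}(j) = \sum_{j,h=0}^{N-1} M_{i,j}\, N_{i-k,j-h}\, w_h(j).$$

By (8), this completes the proof. □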

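Proof (Proof of Theorem 3) Without loss of generality we can restrict ourselves to consider functions such that $\Phi f = f$ and $\mathcal{L}_c f = \mathcal{L}f$. We start by the trivial remark that the result on the rotational descriptors contains the one on the generalized ones. Let us consider

$$I_2^{\lambda,h}(f) := \langle \omega_f(R_h\lambda),\, \omega_f(\lambda) \rangle,$$

$$I_2^{\lambda_1,\lambda_2,h}(f) := \langle \omega_f(R_h\lambda_1) \odot \omega_f(\lambda_2),\, \omega_f(\lambda_1 + \lambda_2) \rangle.$$

Since $f$ is assumed to be compactly supported, its Fourier transform is analytic, and so are the functions $\lambda \mapsto I_2^{\lambda,h}(f)$ and $(\lambda_1, \lambda_2) \mapsto I_2^{\lambda_1,\lambda_2,h}(f)$ for any $h \in \mathbb{Z}_N$. Thus, the statement of the theorem reduces to showing that $\mathrm{RPS}_{\mathcal{L}f} = \mathrm{RPS}_{\mathcal{L}g}$ (resp. $\mathrm{RBS}_{\mathcal{L}f} = \mathrm{RBS}_{\mathcal{L}g}$) if and only if $I_2^{\lambda,h}(f) = I_2^{\lambda,h}(g)$ (resp. $I_2^{\lambda_1,\lambda_2,h}(f) = I_2^{\lambda_1,\lambda_2,h}(g)$) for a.e. $\lambda, \lambda_1, \lambda_2 \in \mathcal{S}$ and all $h \in \mathbb{Z}_N$.

Let us recall the following properties of the tensor product, valid for all $v, v_1, v_2, w, w_1, w_2 \in \mathbb{C}^N$:

1. $(v \otimes w)^* = w \otimes v$,

2. $(v_1 \otimes w_1) \circ (v_2 \otimes w_2) = \langle w_1, v_2 \rangle\, v_1 \otimes w_2$.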
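By these and (12), we immediately have

$$\mathrm{RPS}_{\mathcal{L}f}(\lambda, h) = \langle \omega_f(R_h\lambda)^*,\, \omega_f(\lambda)^* \rangle\; \omega_\Psi(R_h\lambda)^* \otimes \omega_\Psi(\lambda)^* = I_2^{\lambda,h}(f)\; \omega_\Psi(R_h\lambda)^* \otimes \omega_\Psi(\lambda)^*.$$

Hence, whenever $\omega_\Psi(R_h\lambda)^* \otimes \omega_\Psi(\lambda)^* \neq 0$, $\mathrm{RPS}_{\mathcal{L}f}(\lambda, h) = \mathrm{RPS}_{\mathcal{L}g}(\lambda, h)$ if and only if $I_2^{\lambda,h}(f) = I_2^{\lambda,h}(g)$. Since $\omega_\Psi(R_h\lambda)^* \otimes \omega_\Psi(\lambda)^* \neq 0$ if and only if $\omega_\Psi(\lambda) \neq 0$, by the fact that $\Psi \in \mathcal{R}$ this is true for a.e. $\lambda \in \mathcal{S}$. This completes the proof of the part of the statement regarding the rotational power-spectrum.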

To prove the statement regarding the rotational bispectrum, let $B_f = A \circ \mathrm{RBS}_{\mathcal{L}f}(\lambda_1, \lambda_2, h) \circ A^{-1}$, where $A$ is the equivalence given by the Induction-Reduction Theorem and defined in (10). Since $A$ is invertible, determining $\mathrm{RBS}_{\mathcal{L}f}(\lambda_1, \lambda_2, h)$ is equivalent to determining $B_f$. Exploiting the fact that the r.h.s. of (9) is a diagonal matrix we have

= (Ao£/(T«'‘^')®£/(T^i)oA-i)'“’^ 


$$\left(A \circ (M \otimes N) \circ A^{-1}\right)^{k,h} = \left(M_{i,j}\, N_{i-k,j-h}\right)_{i,j \in \mathbb{Z}_N}.$$

By Lemma 5, formula (12), and explicit computations, we then get

= {ujf{Rh^i) 0 A2), + Re^2)) 

X (lj®-( i?^Ai)* 0 uj^{Rh+k^2)*) 0 (Ai + -Rf A2). 

Similarly to before, {u!^{RhXi)* © A 2 )*) 0 aJij/(Ai + 

ReX 2 ) ^ 0 for a.e. Ai, A 2 G 5 since W ETZ. For these couples, 
$B_f = B_g$ if and only if

$$\langle \omega_f(R_h\lambda_1) \odot \omega_f(R_\ell\lambda_2),\, \omega_f(\lambda_1 + R_\ell\lambda_2) \rangle = \langle \omega_g(R_h\lambda_1) \odot \omega_g(R_\ell\lambda_2),\, \omega_g(\lambda_1 + R_\ell\lambda_2) \rangle.$$

Finally, making the change of variables $R_\ell\lambda_2 \mapsto \lambda_2$ completes the proof of the theorem. □